LSTM CCG Parsing
Authors
Abstract
We demonstrate that a state-of-the-art parser can be built using only a lexical tagging model and a deterministic grammar, with no explicit model of bi-lexical dependencies. Instead, all dependencies are implicitly encoded in an LSTM supertagger that assigns CCG lexical categories. The parser significantly outperforms all previously published CCG results, supports efficient and optimal A∗ decoding, and benefits substantially from semi-supervised tri-training. We give a detailed analysis, demonstrating that the parser can recover long-range dependencies with high accuracy and that the semi-supervised learning enables significant accuracy gains. By running the LSTM on a GPU, we are able to parse over 2600 sentences per second while improving state-of-the-art accuracy by 1.1 F1 in domain and up to 4.5 F1 out of domain.
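The abstract's central claim is architectural: every parsing decision is driven by a per-word supertagging model rather than by explicit bi-lexical dependency scores. The following is a minimal sketch of what such an LSTM supertagger could look like, assuming PyTorch; the class name, layer sizes, and category count are illustrative choices, not the paper's actual configuration.

# Hypothetical sketch of a bidirectional LSTM supertagger of the kind the
# abstract describes: one distribution over CCG lexical categories per word,
# with no explicit model of bi-lexical dependencies.  All names and sizes
# here are assumptions for illustration.
import torch
import torch.nn as nn

class LSTMSupertagger(nn.Module):
    def __init__(self, vocab_size, num_categories, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_categories)

    def forward(self, word_ids):
        # word_ids: (batch, sentence_length) integer word indices
        states, _ = self.lstm(self.embed(word_ids))
        # per-word log-probabilities over CCG categories
        return torch.log_softmax(self.out(states), dim=-1)

# Usage: per-word category log-probabilities can be precomputed once and then
# consumed by an A* decoder over a deterministic grammar.
tagger = LSTMSupertagger(vocab_size=50_000, num_categories=425)
log_probs = tagger(torch.randint(0, 50_000, (1, 12)))  # shape (1, 12, 425)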
Similar resources
LSTM Shift-Reduce CCG Parsing
We describe a neural shift-reduce parsing model for CCG, factored into four unidirectional LSTMs and one bidirectional LSTM. This factorization allows the linearization of the complete parsing history, and results in a highly accurate greedy parser that outperforms all previous beam-search shift-reduce parsers for CCG. By further deriving a globally optimized model using a task-based loss, we i...
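Since the key idea in this snippet is linearizing the complete parsing history so that recurrent models can condition on it, a compact sketch may help. It assumes PyTorch, collapses the described factorization into a single LSTM over past actions, and uses placeholder names throughout; it shows only the greedy decoding loop, not the paper's architecture or the stack/buffer bookkeeping of a full CCG transition system.

# Hypothetical illustration of scoring the next shift-reduce action from an
# LSTM state that summarizes the sequence of actions taken so far.
import torch
import torch.nn as nn

ACTIONS = ["SHIFT", "COMBINE", "UNARY"]  # placeholder transition inventory

class HistoryLSTM(nn.Module):
    def __init__(self, num_actions, dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_actions + 1, dim)  # +1 for a start symbol
        self.cell = nn.LSTMCell(dim, dim)
        self.out = nn.Linear(dim, num_actions)

    def step(self, prev_action, state=None):
        h, c = self.cell(self.embed(prev_action), state)
        return self.out(h), (h, c)

def greedy_actions(model, num_steps):
    """Greedily choose one action per step, feeding each choice back in."""
    prev = torch.tensor([len(ACTIONS)])  # start symbol index
    state, history = None, []
    for _ in range(num_steps):
        scores, state = model.step(prev, state)
        prev = scores.argmax(dim=-1)
        history.append(ACTIONS[prev.item()])
    return history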
Keystroke dynamics as signal for shallow syntactic parsing
Keystroke dynamics have been extensively used in psycholinguistic and writing research to gain insights into cognitive processing. But do keystroke logs contain actual signal that can be used to learn better natural language processing models? We postulate that keystroke dynamics contain information about syntactic structure that can inform shallow syntactic parsing. To test this hypothesis, we...
Statistical Parsing for CCG with Simple Generative Models
This paper presents a statistical parser for a wide-coverage Combinatory Categorial Grammar (CCG) derived from the Penn Treebank. The Treebank is translated to a corpus of canonical CCG derivations. We define a generative statistical model over CCG derivations and train it on the transformed Treebank. This model is evaluated using Parseval measures and the accuracy of recovery of word-word depen...
A* CCG Parsing with a Supertag and Dependency Factored Model
We propose a new A* CCG parsing model in which the probability of a tree is decomposed into factors of CCG categories and its syntactic dependencies both defined on bi-directional LSTMs. Our factored model allows the precomputation of all probabilities and runs very efficiently, while modeling sentence structures explicitly via dependencies. Our model achieves the state-of-the-art results on Eng...
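The factorization this snippet describes lends itself to a simple score computation once the bi-directional LSTM outputs are precomputed. The sketch below, using NumPy with illustrative array names and shapes, shows how a candidate tree might be scored under such a factored model and how an upper bound for A* search could be formed; it is an assumption-laden sketch, not the paper's implementation.

# Factored tree score: sum of the selected per-word category log-probability
# and the selected per-word head-attachment log-probability.
import numpy as np

def tree_score(cat_logp, dep_logp, categories, heads):
    """cat_logp: (n, num_cats); dep_logp: (n, n+1) with column 0 as the root.
    categories[i] is the CCG category index assigned to word i,
    heads[i] is the head position chosen for word i (0 for the root)."""
    n = len(categories)
    return sum(cat_logp[i, categories[i]] + dep_logp[i, heads[i]] for i in range(n))

def admissible_heuristic(cat_logp, dep_logp, uncovered):
    """Upper bound on the best achievable score of the words a partial A* item
    has not yet covered: take the best factor for each word independently."""
    return sum(cat_logp[i].max() + dep_logp[i].max() for i in uncovered)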
Dependency Hashing for n-best CCG Parsing
Optimising for one grammatical representation, but evaluating over a different one is a particular challenge for parsers and n-best CCG parsing. We find that this mismatch causes many n-best CCG parses to be semantically equivalent, and describe a hashing technique that eliminates this problem, improving oracle n-best F-score by 0.7% and reranking accuracy by 0.4%. We also present a comprehensi...
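As a rough sketch of the deduplication idea, assuming dependencies are represented as (head, relation, dependent) tuples (an illustrative format, not the paper's), semantically equivalent n-best parses can be collapsed by hashing their dependency sets:

# Hypothetical sketch: distinct CCG derivations often yield identical
# dependency structures, so hashing each parse's dependency set lets the
# decoder keep one representative per equivalence class.
def dependency_hash(deps):
    """deps: iterable of (head_index, relation, dependent_index) tuples."""
    return hash(frozenset(deps))

def deduplicate_nbest(parses):
    """Keep the first (highest-scoring) parse per distinct dependency set."""
    seen, unique = set(), []
    for score, deps in parses:  # assumed (score, dependency-set) pairs
        key = dependency_hash(deps)
        if key not in seen:
            seen.add(key)
            unique.append((score, deps))
    return unique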